MicroRNAs (miRNAs) are small non-coding RNAs of
~20 nt in length that are capable of modulating gene
expression post-transcriptionally. Although miRNAs have 
been implicated in cancer, including breast cancer, the 
regulation of miRNA transcription and the role of defects
in this process in cancer are not well understood. In this
study we have mapped the promoters of 93 breast cancer- 
associated miRNAs, and then looked for associations 
between DNA methylation of 15 of these promoters and 
miRNA expression in breast cancer cells. The miRNA 
promoters with clearest association between DNA methy- 
lation and expression included a previously described and 
a novel promoter of the Hsa-mir-200b cluster. The novel 
promoter of the Hsa-mir-200b cluster, denoted P2, is 
located ~2 kb upstream of the 5' stemloop and maps
within a CpG island. P2 has comparable promoter activity
to the previously reported promoter (P1), and is able to
drive the expression of miR-200b in its endogenous
genomic context. DNA methylation of both P1 and P2
was inversely associated with miR-200b expression in
eight out of nine breast cancer cell lines, and in vitro
methylation of both promoters repressed their activity in
reporter assays. In clinical samples, P1 and P2 were
differentially methylated, with methylation inversely
associated with miR-200b expression. P1 was hypermethylated
in metastatic lymph nodes compared with matched
primary breast tumours, whereas P2 hypermethylation was
associated with loss of either oestrogen receptor or 
progesterone receptor. Hypomethylation of P2 was 
associated with gain of HER2 and androgen receptor 
expression. These data suggest an association between 
miR-200b regulation and breast cancer subtype and a 
potential use of DNA methylation of miRNA promoters 
as a component of a suite of breast cancer biomarkers. 
Introduction

Breast cancer is a heterogeneous disease that can be 
classified on the basis of a number of characteristics 
including tumour size, histological subtype and grade, 
oestrogen (ER) and progesterone (PR) receptor and 
HER2 expression, axillary lymph node (LN) status and 
expression profile (Sorlie et al., 2001). Some of these 
features have been associated with disease character- 
istics and can therefore be used to inform patient 
management. For example, patients with tumours that 
test positive for ER and HER2 can be treated with 
tamoxifen and herceptin, respectively, and have a 
significantly better prognosis than those that test 
negative for these markers. However, the heterogeneity 
that exists even within breast cancer subgroups defined 
by multiple markers means that for the vast majority of 
breast cancer cases, predicting outcome remains a 
challenge, and thus additional informative biomarkers 
are urgently needed. 

Breast cancer results from abnormalities in the quality 
or quantity of certain gene products, including coding 
and non-coding genes. MicroRNAs (miRNAs) are small 
non-coding RNAs ~20 nt in length that are capable
of modulating gene expression post-transcriptionally
(Cullen, 2004; Boyd, 2008; Bartel, 2009). MiRNAs can
exhibit either tumour suppressor or oncogenic roles by
modulating key cellular processes in cell-cycle
progression, apoptosis and invasion (Bartels and Tsongalis,
2009; Mirnezami et al., 2009; Visone and Croce, 2009).
In several studies, differential miRNA expression has 
been shown to distinguish normal and breast tumour 
tissue, breast cancer subtypes, ER, PR and HER2 
status, and to predict lymph node status and invasiveness
(Iorio et al., 2005; Mattie et al., 2006; Foekens
et al., 2008; Yan et al., 2008; Lowery et al., 2009).
Together, these studies suggest a potential diagnostic 
and prognostic use of miRNAs as biomarkers in breast 
cancer. 

Quantitative defects in miRNAs arise through several 
mechanisms, including aberrant DNA methylation. 
In humans, DNA methylation usually occurs at the number
5 carbon of the cytosine in a CpG dinucleotide. Regions
with a high density of CpGs, termed CpG islands (CGIs), are
usually associated with promoter elements, and their
methylation usually leads to gene repression. Aberrant
DNA methylation of miRNA genes has been associated
with several cancers (Lujambio et al., 2007; Lehmann
et al., 2008; Lodygin et al., 2008), suggesting a possible
use of miRNA DNA methylation as a prognostic tool.
For example, miR-9-1 and miR-34a are hypermethylated
in breast cancer (Lehmann et al., 2008; Lodygin
et al., 2008). In addition, the miR-200b cluster (miR-200b,
-200a and -429) has a CGI-associated promoter
~4 kb upstream of the sequence encoding the mature
miRNA (Bracken et al., 2008), and aberrant DNA
methylation of this sequence is associated with loss of
miR-200 expression in colon (Han et al., 2007), bladder
(Wiklund et al., 2011) and pancreatic (Li et al., 2010)
cancers. The contribution of miR-200b cluster gene
methylation to breast cancer has not yet been reported.

Although much is known about the biogenesis and 
function of miRNAs, relatively little is known about the 
transcriptional regulation of miRNA genes. To date, a 
limited number of miRNA promoters have been 
experimentally characterized and only recently have 
several miRNA promoter prediction algorithms 
emerged (Zhou et al., 2007; Fujita and Iba, 2008; Linhart
et al., 2008; Marson et al., 2008; Ozsolak et al., 2008;
Wang et al., 2009). These studies show that miRNA
promoters lie anywhere from a few bases upstream of
the stemloop to tens of kilobases upstream (Linhart
et al., 2008; Marson et al., 2008; Ozsolak et al., 2008;
Wang et al., 2009). Furthermore, several miRNAs have
multiple promoters (Ozsolak et al., 2008; Wang et al.,
2009; Monteys et al., 2010).

In this study, 93 miRNAs previously associated with
breast cancer were prioritized for experimental analysis
using bioinformatics to look for CGI-associated promoters.
The CGI-associated promoters of 15 miRNAs were
mapped and their methylation determined in a panel of nine
breast cancer cell lines. A novel promoter for the
miR-200b cluster and its role in regulating miR-200b
expression were investigated. The relationships of
methylation of this promoter, and of the previously described
miR-200b cluster promoter, with miR-200b expression
and clinical characteristics in breast cancer are described.



Results 

Fifty-five miRNAs previously implicated in breast cancer 
are located within 5kb of a predicted CGI 
To identify candidate promoters for which methylation 
could be associated with breast cancer development, a 
list of 93 miRNAs implicated in breast cancer was
collated from the literature (Supplementary Table 1).
Analysis with CpGPlot and CpG Island Searcher showed that
55 (59%) of these miRNAs had a predicted CGI within 5 kb
upstream of the region encoding their 5' stemloop. The
CoreBoost_HM promoter prediction algorithm was used to
predict regulatory elements controlling the transcription
of the 55 CGI-associated miRNAs. This algorithm uses
PolII ChIP binding, histone modifications and DNA
motifs associated with promoters to predict putative
transcription start sites (TSS) (Wang et al., 2009).
Putative promoters were defined by a CpG promoter
prediction cutoff score of 0.5, representing a 90%
likelihood of a TSS within 500 bp of the predicted region.
Figure 1 summarizes the process. 
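
The CGI screening step can be illustrated with a minimal sketch. The thresholds below (length of at least 200 bp, GC content of at least 50% and an observed/expected CpG ratio of at least 0.6) are the commonly cited Gardiner-Garden and Frommer criteria and are an assumption here, since the parameters passed to CpGPlot and CpG Island Searcher are not stated. The function names are hypothetical, and the code is not the implementation of either tool.

```python
def cpg_stats(seq):
    """Return (GC fraction, observed/expected CpG ratio) for a DNA sequence."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n if n else 0.0
    # Observed/expected CpG = (#CpG * N) / (#C * #G)
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp


def looks_like_cgi(seq, min_len=200, min_gc=0.5, min_obs_exp=0.6):
    """Apply the assumed CGI thresholds to a single candidate region."""
    if len(seq) < min_len:
        return False
    gc_fraction, obs_exp = cpg_stats(seq)
    return gc_fraction >= min_gc and obs_exp >= min_obs_exp


# Hypothetical toy sequences, not real genomic data.
cpg_rich = "CG" * 100   # 200 bp, all C/G, CpG-dense
at_rich = "AT" * 100    # 200 bp, no C or G
```

In practice such a test would be applied in sliding windows across the region upstream of each stemloop; the single-region check is kept here for brevity.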

Experimental validation of predicted novel promoters of 
15 miRNAs 

To determine if these predicted CpG promoter sequences
had experimentally detectable promoter activity, 600-1000 bp
of genomic sequence around each predicted site
was cloned upstream of a luciferase gene and assayed for
reporter activity in either MCF7 or MDA-MB-231
breast cancer cell lines. When a miRNA had more than
one predicted promoter, a fragment encompassing each
prediction was cloned. Their genomic locations are
detailed in Figures 2 and 3 and Supplementary Table 1.
The previously described promoters of the miR-17
cluster (Yan et al., 2009) and the miR-200b cluster
(Bracken et al., 2008) were included as positive controls,
whereas a non-CoreBoost_HM predicted fragment in
the miR-17 cluster was used to control for background
promoter activity.

Twenty-two novel promoters from 15 miRNAs
exhibited at least fivefold activity compared with the
promoter-less pGL3-basic control in at least one cell line
(Figures 4-6a). As expected, the previously described



Oncogene 




Mapping the sequences controlling 93 breast cancer-associated miRNA genes 

EJH Wee et al



4184 






Figure 2 UCSC screenshots of miRNA candidates and their associated genomic features. Bars representing miRNAs are shown in
red, CGIs in green, promoter and methylation-sensitive high-resolution melt analysis fragments in black. Annotated genes are marked
in blue. Orientation of genes and fragments is indicated by directional arrows. CoreBoost_HM promoter predictions are shown as
black peaks.



promoters of miR-17 and miR-200b clusters had strong 
promoter activity whereas the non-CoreBoost_HM 
predicted fragment had no detectable promoter function 
(Figures 3d and 4a). The 15 miRNAs with experimentally
validated promoters are miR-9-1, miR-9-3, miR-10b,
miR-22, miR-124-1, miR-124-2, miR-124-3,
the miR-130b cluster, miR-193b, the miR-200b cluster,
miR-210, miR-320a, miR-335, miR-373 and miR-663.
Three promoters were mapped for the miR-124-3 locus;
two promoters were mapped for miR-9-1, 22, 124-1, 124-2,
193b and 200b; and one promoter was mapped for
miR-9-3, 10b, 130b, 210, 320a, 335, 373 and 663. To map
the minimal promoter regions and to facilitate methylation
analysis, promoter fragments of the 15 miRNAs
were fine mapped to ~300 bp (Figures 2, 3 and 6b).

The miR-200b cluster P2 promoter is sufficient to drive 
expression of miR-200b 

To determine whether the P2 promoter could drive the
expression of miR-200b in its endogenous genomic
context, low miR-200b expressing MDA-MB-231 cells
(Gregory et al., 2008) were transfected with a miR-200b
minigene spanning the P2, but not the P1, promoter and
the sequence corresponding to the mature miR-200b. This
minigene was generated by replacing the luciferase
coding sequence of pGL3-basic with the miR-200b
genomic sequence (Figure 6c). The introduction of
the miR-200b minigene resulted in an eightfold increase
in mature miR-200b expression over the pGL3-basic
control (Figure 6c). Deletion of the minimal promoter in
the minigene reduced miR-200b expression by 50%
(Figure 6c). Collectively, these results indicate that the
P2 promoter can regulate miR-200b, and very possibly
miR-200a and miR-429, as a polycistronic primary transcript
(Bracken et al., 2008).

The miR-200b cluster P1 and P2 promoters are
independent

To address the hypothesis that the P1 and P2 promoters
function synergistically to enhance expression of the
miR-200b cluster, a 2.5-kb fragment encompassing both
promoters was cloned upstream of the luciferase gene
and assayed for reporter activity in MDA-MB-231 cells,
in which only P2 was observed to be functional, and
MCF7 cells, in which functional activity was observed
for both promoters. As predicted, the P1+P2 fragment
produced similar reporter activity to the P2 fragment
alone in MDA-MB-231 cells (P = 0.34) (Figure 6a). In
contrast, in MCF7 cells, the reporter activity of P1+P2
was not significantly greater than the activity of P1
(P = 0.4), but was significantly stronger than that of P2
(P<0.05) (Figure 6a). The activities of P2 alone
and P1+P2 were also significantly higher (P<0.05) in
MCF7 than in MDA-MB-231 cells (Figure 6a). Taken
together, these data suggest that the P1 and P2
promoters function independently.
Figure 3 UCSC screenshots of miRNA candidates and their associated genomic features. Bars representing miRNAs are shown in
red, CGIs in green, promoter and methylation-sensitive high-resolution melt analysis fragments in black. Annotated genes are marked
in blue. Orientation of genes and fragments is indicated by directional arrows. CoreBoost_HM promoter predictions are shown as
black peaks.



The miR-200b cluster has multiple TSS 
To complement the promoter mapping experiments, 
attempts were made to map the TSS of P2. Classical 5'
RACE PCR was employed to determine the TSS of P2
using RNA from MDA-MB-231 cells transfected with
the P2 construct to enrich for P2-derived transcripts.
However, repeated attempts with the classical 5' RACE
protocol were unsuccessful, consistently producing non-specific
smears (data not shown). Successful amplification
of template controls indicated that the cDNA
synthesis had worked and that this result was more
likely to reflect heterogeneity in miR-200b cluster
transcripts. An alternative 'PCR walk' approach to
mapping the TSS was performed using a single
transcript-specific reverse primer and various forward
primers toward the 5' end of the cDNA transcript. The
5' end of the longest transcript lay between -3032 bp and
-2447 bp upstream of the 5' stemloop, as indicated by loss
of PCR amplification (Figure 6d). This observation was in
agreement with both the minigene and the luciferase
reporter assays. To further test the hypothesis that the
miR-200b cluster has multiple TSS, publicly available breast
cancer-specific RNA-sequence and RNA PolII ChIP data were
analysed at the miR-200b locus (Figure 6e). Multiple
RNA-sequence peaks were observed along the CGI for
T47D and MCF7 cells, indicating expression from the
CGI. Furthermore, multiple RNA PolII binding signals
in MCF7 cells were detected along the associated CGI,
suggesting multiple TSS. A strong RNA PolII signal
overlapping P1 also suggested preferential transcription
from P1 in MCF7 (Figure 6e).

Methylation of miR-200b cluster and miR-335 promoters 
is associated with reduced miRNA expression 
DNA methylation of the minimal promoters of the 15
miRNAs was assessed by methylation-sensitive high-resolution
melt analysis (Wojdacz and Dobrovic, 2007)
in a panel of nine breast cancer cell lines. The proximal
miR-9-1 promoter was not included as methylation of
this promoter had been previously described (Lehmann
et al., 2008). The miR-17 cluster promoter was also
excluded because the high density of CG dinucleotides
made it unsuitable for methylation-sensitive high-resolution
melt analysis. Promoter methylation was
then compared with miRNA expression in the same
cell lines. miR-200b cluster and miR-335 promoter
methylation were inversely associated with miRNA
expression. For the miR-200b cluster, eight out of the nine
cell lines displayed an inverse association (Figure 7a,
Supplementary Figure 2). Only MCF7 highly
expressed miR-335, and MCF7 also had the lowest methylation
compared with the remaining eight, which were
fully methylated and had minimal miR-335 expression
Figure 4 (a-i) Promoter activities of miRNA candidates. miRNA promoter activity in cells expressed in RLU ± the s.e.m. Data were
generated from three independent experiments. Promoter fragments are labelled 1, 2 or 3 and A, B or C indicates the sub-fragment of 
that respective promoter. 



(Supplementary Figure 1A). However, since the inverse
association was stronger for miR-200b, the P1 and P2
promoters represented better candidates for further
analysis. In contrast, the miR-210 and miR-320a
promoters were unmethylated in all nine cell lines,
whereas DNA methylation was not associated with
miRNA expression for miR-9, miR-10b, miR-124,
miR-373 and miR-663 (Supplementary Figure 1).



The minimal miR-200b cluster promoters are regulated by 
DNA methylation 

The novel P2 promoter had comparable activity to P1 in
MCF7 cells but, unlike P1, was functional in both cell
lines tested (Figure 6a). The minimal P2 promoter maps
to -2228/-1993 bp upstream of the miR-200b
5' stemloop (Figure 6b). To confirm that DNA methylation
directly repressed promoter activity, P1 and P2 were
cloned into a CpG-free reporter construct (Klug and
Rehli, 2006) and in vitro methylated by SssI DNA
methylase. Methylated P1 and P2 constructs displayed a
significant reduction in promoter activity compared
with their mock-methylated constructs when transfected
into T47D cells, in which both promoters are endogenously
unmethylated and functional (Figure 7b). This
suggests that DNA methylation represses miR-200b
cluster promoter activity.

miR-200b P1 and P2 promoters are differentially
methylated in primary breast tumours
To study DNA methylation of the miR-200b promoters,
Sequenom MassArray was performed on Grade 3 FFPE
Figure 6 miR-200b cluster has two functional promoters. (a) Promoter activities of miR-200b P1 and P2 in MCF7 and MDA-MB-
Horizontal bars indicate the location of the miR-200b cluster, CGI and oestrogen response element.



clinical samples. In all cases, P1 and P2 were differentially
methylated in both tumours and lymph nodes
(Figures 8a and b). In addition, P1, but not P2, was
hypermethylated in lymph nodes compared with
matched primary tumours (Figures 8c and d).
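
The Figure 8 caption describes the P1/P2 comparison as a log10 ratio, with positive values meaning P1 is more methylated than P2. A minimal sketch of that calculation follows. The function name and the example methylation fractions are hypothetical, and how samples with zero methylation were handled in the study is not stated, so this sketch simply rejects them.

```python
import math


def log10_methylation_ratio(p1_methylation, p2_methylation):
    """Return log10(P1 / P2) for two methylation fractions.

    Positive result: P1 > P2; negative result: P1 < P2; zero: equal.
    """
    if p1_methylation <= 0 or p2_methylation <= 0:
        # log10 is undefined for zero or negative ratios
        raise ValueError("methylation fractions must be positive")
    return math.log10(p1_methylation / p2_methylation)


# Hypothetical per-sample mean methylation fractions (P1, P2).
samples = {"T1": (0.50, 0.05), "T2": (0.20, 0.20), "T3": (0.04, 0.40)}
ratios = {name: log10_methylation_ratio(p1, p2) for name, (p1, p2) in samples.items()}
```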

To determine if hypermethylation was associated with
expression of the miR-200b cluster in primary tumours,
qPCR for miR-200b was performed on tumour samples
from which RNA was available. P1 hypermethylation
was associated with loss of miR-200b expression in
seven out of nine samples (Supplementary Figure 3A),
whereas P2 hypermethylation was associated with loss of
miR-200b expression in six out of seven samples tested
(Supplementary Figure 3B). These results suggested that
hypermethylation of miR-200b cluster promoters could
regulate miRNA expression in tumours.
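
The qPCR results are reported relative to the reference RNA RNU6B (Figure 7). The exact quantification method is not described in this section, so the following is only a sketch of one common approach, the delta-Ct calculation, which assumes roughly 100% amplification efficiency for both assays. The function name and the Ct values are hypothetical.

```python
def relative_expression(ct_target, ct_reference):
    """Expression of a target relative to a reference by the delta-Ct method.

    Assumes each PCR cycle doubles the product, so a target that crosses
    the threshold 3 cycles later than the reference is 2**-3 as abundant.
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** -delta_ct


# Hypothetical Ct values for miR-200b and RNU6B in two samples.
high_expresser = relative_expression(ct_target=22.0, ct_reference=22.0)
low_expresser = relative_expression(ct_target=25.0, ct_reference=22.0)
fold_difference = high_expresser / low_expresser
```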
Figure 7 DNA methylation represses miR-200b P1 and P2 activity in breast cancer cells. (a) DNA methylation and miR-200b
expression levels in a panel of nine breast cancer cell lines. Top: black bars represent the percentage methylation. Bottom: miR-200b
expression was assessed by qPCR. Expression is shown relative to RNU6B and bars represent the mean±s.e. of two independent
experiments. (b) Reporter activity of miR-200b P1 and P2 methylated by SssI DNA methylase (white) compared with mock methylated
plasmids (grey) ± s.d. of two separate experiments.



Methylation of the miR-200b P2 promoter is associated 
with ER, PR, HER2 and androgen receptor expression in 
primary breast tumours 

To ascertain whether DNA methylation of the miR-200b
cluster promoters is associated with expression of the
routinely used breast cancer biomarkers ER, PR and
HER2, methylation was assessed in patients positive and
negative for expression of these receptors. Methylation
of P2, but not P1, was significantly higher in tumours
that were ER or PR negative (Figures 9a and b,
respectively). Hypomethylation of P2 was also associated
with HER2 positivity (Figure 9c). Expression of androgen
receptor, a potential breast cancer biomarker (Hu et al.,
2011) and regulator of the miR-200 family (Xu et al.,
2010; Waltering et al., 2011), was also associated with
hypomethylation of P2 (Figure 9d). Although the miR-200b
cluster is involved in metastasis, which in turn
affects prognosis, no evidence of an association between
DNA methylation and survival was found.
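
The receptor-status comparisons in Figure 9 are reported with Mann-Whitney P-values. As an illustration of the underlying rank-based statistic, the sketch below computes the Mann-Whitney U value for two cohorts using only the standard library. The function name and the methylation values are hypothetical, and converting U into a P-value (which needs exact tables or a normal approximation) is deliberately left out.

```python
def mann_whitney_u(x, y):
    """Return the Mann-Whitney U statistic for sample x against sample y.

    Values are ranked in the pooled sample (ties share the average rank);
    U is the rank sum of x minus its minimum possible value.
    """
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        # Extend j to the end of the block of values tied with pooled[i]
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    n_x = len(x)
    return rank_sum_x - n_x * (n_x + 1) / 2


# Hypothetical mean P2 methylation fractions in two cohorts.
er_positive = [0.10, 0.20, 0.30]
er_negative = [0.60, 0.70, 0.80]
u_pos = mann_whitney_u(er_positive, er_negative)
u_neg = mann_whitney_u(er_negative, er_positive)
```

A U of 0 for one cohort means every value in it ranks below every value in the other; the two U values always sum to the product of the sample sizes.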



Discussion 

Transcriptional regulation of miRNA genes is poorly 
understood and only a few miRNA promoters have been 
reported. A comprehensive understanding of miRNA 
promoters is a prerequisite for their use as genetic or 
epigenetic biomarkers. In this report, novel CGI-associated
miRNA promoters were mapped and analysed
for associations between DNA methylation and
miRNA expression. In all, 59% of the miRNAs
examined were associated with a CGI within 5 kb
upstream, a proportion similar to the estimated proportion
of CGI-associated coding genes and consistent with previous
estimates for miRNA promoters (Ozsolak et al., 2008;
Corcoran et al., 2009). Twenty-two novel promoters
were identified and shown to be active in reporter assays.
miR-10b had a previously described promoter (Ma et al.,
2007; Zhou et al., 2007) immediately upstream of the
mature miRNA sequence (Figure 2). However, we could









Figure 8 Differential methylation of P1 and P2 in clinical samples. (a) Log10 ratios of P1 to P2 in 26 primary tumours. (b) Log10
ratios of P1 to P2 in 23 lymph nodes (LN). Positive values: P1 > P2; negative values: P1 < P2. Graph heights represent magnitude of
difference in methylation between P1 and P2. (c) Mean methylation profile in matched tumours and LN with horizontal s.e. bars for
individual CpG units. t-Test P-value as indicated. (d) Box plot of the average methylation in tumours and LN. (+): median, box: 25-75
percentile, whiskers: max/min, N: sample size, Mann-Whitney P-values as indicated.



not detect promoter activity for this fragment in the breast 
cancer cells tested (Figure 4c). We were also unable to 
detect any activity in fragments encompassing CoreBoost_HM
predicted promoter regions for miR-125a in
the cell lines used (Figures 3 and 4i). A likely explanation 
was that neither cell line expressed miR-125a. 

In all, 7 of 15 miRNAs had two or more promoters in
close proximity, usually at either end of the associated
CGI. Although it was not clear how the multiple
promoters function in regulating their miRNA genes
or why the promoters were usually at either end of the
CGI, it was evident that regulation of miRNAs is a
complex process. The miR-200b cluster (miR-200a,
miR-200b and miR-429) is an example of a miRNA
with promoters at either end of the CGI. The P1
promoter, located at the distal end of the CGI, was
predicted (Bracken et al., 2008) based on the presence of
a 5' EST, the presence of E-Box motifs and the presence
of a CGI, which is commonly associated with promoters
of coding genes. Further, a 7.5-kb primary transcript of
the miR-200b cluster was described using a 'PCR walk'
approach and P1 promoter activity was demonstrated
using a luciferase reporter assay. Using a similar
approach, a novel promoter, P2, is described here. P2
was predicted ~2.5 kb downstream of P1 (Figure 3) by
the CoreBoost_HM promoter prediction algorithm
(Wang et al., 2009), which utilizes empirical data such
as ESTs, RNA PolII binding and histone modification
profiles in addition to DNA motifs associated with core
promoters to accurately predict active promoter sites.
We demonstrate that the P2 promoter has an activity
similar to that of the P1 promoter (Figure 6a), is
functional in breast cancer cell lines (Figures 6a and 7b)
and is able to drive the expression of miR-200b in its
endogenous genomic context (Figure 6c). Thus, the P2
promoter is likely to be important in the regulation of
the miR-200b cluster. Deletion of the P2 minimal
promoter also reduced miR-200b levels by 50%
(Figure 6c), which may indicate multiple TSS as previously
suggested (Wiklund et al., 2011). In addition, DNA
methylation of both miR-200b promoters was inversely
associated with miR-200b expression in eight out of nine
breast cancer cell lines studied (Figure 7), suggesting
regulation by DNA methylation. However, the precise role
of P1 and P2 in regulating the cluster is unclear. In our
reporter assays, P1 and P2 seemed to function independently
(Figure 6a). In clinical samples, DNA methylation at P1
also differed from that at P2 in both tumours and
lymph node metastases (Figures 8a and b), thus
supporting the hypothesis that the two promoters have
different regulatory roles. This hypothesis is supported
by other studies in bladder cancer cells, in which a region
encompassing P2, but not P1, was unmethylated in cells
expressing high levels of miR-200b (Wiklund et al., 2011).
Taken together, the evidence suggests that the P1 and P2
transcripts are regulated by different mechanisms and
this could in turn have a role in regulating metastasis.

Of the nine cell lines studied, MCF7 did not show an
inverse association between methylation and miRNA
expression (Figure 7a). A minority of patients also did
not display a reciprocal relationship between promoter
methylation and miRNA levels (Supplementary Figure 3).
In previous studies (Han et al., 2007; Wiklund et al.,
2011), miRNA repression by DNA methylation was
usually accompanied by histone modifications associated
with gene silencing. Thus, other mechanisms
Figure 9 P2 methylation is associated with ER, PR, HER2 and AR receptor status. (a) Methylation status of ER positive (pos) and
ER negative (neg) cohorts. (b) Methylation status of PR pos and PR neg cohorts. (c) Methylation status of HER2 pos and HER2 neg
cohorts. (d) Methylation status of AR pos and AR neg cohorts. Left: mean methylation profile with horizontal s.e. bars for individual
CpG units. t-Test P-value as indicated. Right: box plot of the average methylation in tumours and LN. (+): median, box: 25-75
percentile, whiskers: max/min, N: sample size, Mann-Whitney P-values as indicated.
including chromatin remodeling or post-transcriptional
regulatory events may account for this inconsistency. 
Perhaps these repressive histone marks were absent in 
these cases thus resulting in open chromatin that was 
readily expressed. 

In this study we describe, for the first time, differential
methylation of the P1 and P2 regions of the miR-200b
cluster in breast cancer. That the differential methylation is
functional is evidenced by our observations that DNA
methylation is inversely associated with miR-200b
expression in both breast cancer cell lines (Figure 7)
and clinical samples (Supplementary Figure 3). These
observations are consistent with the previously reported
tumour suppressive role of miR-200b (Korpal and Kang, 2008).
They are also consistent with previous reports of
aberrant DNA methylation of the miR-200b cluster
proximal CGI, containing both P1 and P2, in colon,
bladder and pancreatic cancers (Han et al., 2007;
Li et al., 2010; Wiklund et al., 2011).

Loss of ER and PR expression was also associated
with DNA methylation at P2 in breast tumours
(Figures 9a and b). Patients with tumours that express
these receptors often have a better prognosis because
they respond well to treatments such as tamoxifen. We
hypothesize that methylation at P2 is likely to be
associated with a lower level of miRNA expression,
resulting in a more aggressive tumour (Korpal and
Kang, 2008) that is unresponsive to these therapies and
generally associated with poor prognosis. In publicly
available ER ChIP-sequence data (Schmidt et al., 2010),
ER bound to a putative ER response element just
downstream of P1 upon ER stimulation in MCF7 cells
(Figure 6e). In a microarray study (Klinge, 2009), miR-200a
and miR-200b were significantly upregulated in
MCF7 after 6 h of E2 induction. However, in a similar
independent study, miR-200a and 200c were found to be
significantly downregulated after 48 h of E2 induction
(Maillot et al., 2009). Although the studies seemed to have
conflicting conclusions, they do suggest a possible regulatory
mechanism between ER and the miR-200 family.

A double negative feedback regulatory relationship
between the miR-200 family and ZEB1 (Bracken et al.,
2008; Burk et al., 2008) has been shown to regulate the
delicate balance between mesenchymal and epithelial
cellular states. Based on these data, we propose that miR-200b
is repressed in the early stages of tumourigenesis in
order to promote epithelial-mesenchymal transition (EMT)
and thus the spread of the tumour, followed by later
induction of miR-200b to promote mesenchymal-epithelial
transition and thus establishment of the tumour cells at a
distant site (for example, lymph node). Our data are
consistent with this, as we show that only P1 was
hypermethylated in matched lymph nodes compared with their
primary tumours. This, coupled with miRNA repression,
suggests a DNA methylation mechanism for EMT initiation in
addition to the previously described TGFB/ZEB pathway. At P2,
the absence of differential methylation between primary
tumours and matched lymph nodes, which possibly maintains
basal levels of miR-200, is consistent with the
mesenchymal-epithelial transition observed in mouse models.
Metastatic murine breast cancer cells expressing low
levels of miR-200 were able to invade distant tissue but
unable to colonize. However, when miR-200 was overexpressed,
these cells could form macroscopic tumours
at distant sites (Dykxhoorn et al., 2009). Further
support for this model comes from studies in
bladder cancer (Wiklund et al., 2011), where hypomethylation
of the P2 region was sufficient for miR-200b
cluster expression. This hypomethylation could also
possibly account for the elevated levels of the miR-200
family observed in other cancers (Hiroki et al., 2010;
Li et al., 2010; Lee et al., 2011).

Collectively, the evidence presented here indicates that
regulation of the miR-200b cluster is complex, with
transcription driven by at least two distinct promoters that
are sensitive to DNA methylation. The novel P2
promoter functions independently of P1 and can drive
the expression of miR-200b. However, the precise roles
of P1 and P2, and the conditions under which they are
utilized, are still not clear and will require further examination.



Materials and methods 

Bioinformatics 

A list of 93 miRNAs implicated in breast cancer was generated
by literature review. Genomic sequences +5 and -1 kb of the
5' stemloop of each miRNA were analysed for CGIs using
CpGPlot and CpG Island Searcher. Putative promoters were
defined by a CoreBoost_HM score of at least 0.5 (Wang et al.,
2009) within this 6 kb window. Initial 600-1000 bp fragments
overlapping the predicted sites were cloned and assayed for
promoter activity as described later. This process is illustrated
in Figure 1.

RNA Polymerase II ChIP-sequence data mapped to human
genome HG18 were obtained from the National Center for
Biotechnology Information Gene Expression Omnibus, GEO accession
number GSE14664. RNA-sequence data (Wang et al., 2008) were
also mapped to human genome HG18. RNA Polymerase II and
RNA-sequence data were visualized on the Integrative Genomics
Viewer using the data ranges indicated.

Cell culture 

Breast cancer cell lines MDAMB157, MDAMB231,
MDAMB436, MDAMB468, MCF7, T47D, ZR75-1, Hs578T
and BT549 were obtained from American Type Culture 
Collection (ATCC, Manassas, VA, USA) and cultured 
according to the manufacturer's recommendations. 

Patient samples 

With approval from local Human Ethics committees, human
breast tumours and matching lymph node metastases
were collected from 56 patients who underwent surgical
resection, without preoperative radiochemotherapy, at Princess
Alexandra Hospital between 1988 and 2000. All patients were
female, aged from 30 to 94 years, with a median age of 56
years. The ER, PR and HER2 receptor status of each patient was
determined by a qualified pathologist. Details are provided in
the Supplementary Information.

DNA extractions and purifications 

Genomic DNA from cell lines was extracted using the
NucleoSpin Tissue Prep kit (Macherey-Nagel, Germany)
according to the manufacturer's instructions. Plasmid DNA
was purified using the Miniprep Kit (Qiagen, Doncaster, VIC,
Australia). For human tumour samples, four FFPE tumour-rich
tissue cores (1 × 0.6 mm) were crushed and digested with
proteinase K at 55 °C for 2 days. Genomic DNA was purified
using the PureGene kit (Qiagen) according to the
manufacturer's instructions.

Generation of plasmid constructs 

All promoter reporter constructs were cloned into pGL3-Basic
(Promega, Sydney, NSW, Australia) unless otherwise specified.
For the in vitro methylation plasmids, P1 and P2 fragments
were cloned into a CpG-free luciferase reporter construct,
pCpG-basic (Klug and Rehli, 2006; a gift from Klug and
Rehli). PCR was performed using KapaHiFi polymerase
(Kapa Biosystems, Woburn, MA, USA). All constructs were
confirmed by sequencing. All primers used and cloning details
are provided in the Supplementary Information.

Transfections and reporter assays 

All transfections used a 3 µl:1 µg ratio of Fugene (Roche,
Castle Hill, NSW, Australia) transfection reagent to DNA. For
luciferase assays, either MCF7 or MDA-MB-231 cells were
co-transfected with 400 ng of promoter construct and 10 ng of
RL-TK plasmid (Promega) as a transfection control, then
harvested and assayed for reporter activity after 48 h. The
Dual-Glo luciferase Assay kit (Promega) was used as
recommended by the manufacturer. Firefly luciferase levels
were normalized to Renilla luciferase levels and expressed relative
to pGL3-Basic levels (RLU). Statistical analysis was performed
using an unpaired two-tailed t-test.
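
The normalization described above can be sketched as follows. This is a minimal illustration, assuming one firefly and one Renilla reading per replicate well and using the mean pGL3-Basic ratio as the reference. The function name and argument layout are hypothetical and are not taken from the original analysis.

```python
from statistics import mean


def relative_luciferase(firefly, renilla, basic_firefly, basic_renilla):
    """Return per-replicate firefly/Renilla ratios relative to pGL3-Basic.

    Each argument is a list of raw readings, one value per replicate well.
    Using the mean pGL3-Basic ratio as the reference is an assumption.
    """
    basic_ratio = mean(f / r for f, r in zip(basic_firefly, basic_renilla))
    return [(f / r) / basic_ratio for f, r in zip(firefly, renilla)]
```

The resulting replicate values for two constructs could then be compared with an unpaired two-tailed t-test, for example `scipy.stats.ttest_ind`.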

For minigene experiments, MDA-MB-231 cells were grown
to 60-70% confluence in 6-well plates, transfected with
1 µg of DNA and harvested after 72 h.

Identification of TSS of Hsa-mir-200b 

Total RNA from MCF7 cells transfected with the P2 luciferase
reporter construct was extracted using Trizol (Invitrogen) and
treated with DNase I (NEB, Ipswich, MA, USA). First strand
cDNA was reverse transcribed using SuperScript III
(Invitrogen) with a luciferase-specific primer, R1, at 50 °C.
This then served as a template for PCR amplification. PCR
'walking' towards the 5' end was performed using primers A to
D with R2. All PCR products were visualized on a 1% agarose
gel. Primer sequences are given in Supplementary Table 3.
Classical 5'RACE was performed as previously described (2005).

Quantitation of miRNAs

Total RNA was extracted from cell lines using Trizol
(Invitrogen). RNA from clinical samples was extracted using the
miRNeasy kit (Qiagen). For miR200b and miR335 experiments,
cDNA was made from total RNA using the TaqMan
MicroRNA Reverse Transcription Kit (Applied Biosystems,
Mulgrave, VIC, Australia) with both miRNA and RNU6B
(loading control) reverse transcription primers in the same
reaction. Real-time PCR was performed using the TaqMan
microRNA Assay (Applied Biosystems) according to the
manufacturer's instructions. For all other miRNAs, the Qiagen
miScript PCR system for miRNA quantification was used with
the RNU6B loading control. Changes in expression levels were
calculated using the ΔΔCt method (Livak and Schmittgen, 2001).
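
As a worked illustration of the ΔΔCt calculation, the sketch below takes RNU6B as the reference, as stated above. The choice of calibrator sample is not specified in the text, so the calibrator arguments are an assumption, as is the function name. The method itself assumes roughly 100% amplification efficiency for both assays.

```python
def fold_change_ddct(ct_mirna_sample, ct_rnu6b_sample,
                     ct_mirna_calibrator, ct_rnu6b_calibrator):
    """Relative expression by the 2^-ΔΔCt method.

    ΔCt  = Ct(miRNA) - Ct(RNU6B), computed for sample and calibrator.
    ΔΔCt = ΔCt(sample) - ΔCt(calibrator).
    The calibrator is whichever sample expression is reported
    relative to (an assumption; the text does not name one).
    """
    delta_sample = ct_mirna_sample - ct_rnu6b_sample
    delta_calibrator = ct_mirna_calibrator - ct_rnu6b_calibrator
    ddct = delta_sample - delta_calibrator
    return 2.0 ** (-ddct)
```

With illustrative Ct values of 24 and 18 in the sample and 27 and 18 in the calibrator, ΔΔCt is 6 - 9 = -3, which gives an 8-fold higher level in the sample.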

Bisulfite modification and methylation-sensitive high-resolution
melt analysis

2 µg of DNA extracted from cell lines was subjected to bisulfite
modification with the MethylEasy Xceed kit (Human Genetic
Signatures, Randwick, NSW, Australia) according to the
manufacturer's instructions. PCR amplification and
methylation-sensitive high-resolution melt analysis (Wojdacz and
Dobrovic, 2007) were performed in duplicate on the RotorGene Q
(Qiagen). Primers were designed according to the principles
outlined by Wojdacz and Dobrovic (2007) to control for PCR
bias and are shown in Supplementary Tables 4 and 5. PCR
conditions are provided in the Supplementary Information.
Bisulfite-treated CpGenome Universal Methylated DNA (Chemicon,
Millipore, Kilsyth, VIC, Australia) and DNA from the appropriate
cell lines were used as positive/methylated and negative/
unmethylated controls, respectively. WGA DNA made with
the GenomiPhi kit (Amersham GE Healthcare, Rydalmere,
NSW, Australia) was used as the unmethylated control for
miR335 and miR663. The analysis of each region also included
controls mixed at 25, 50 and 75% methylated to
unmethylated template ratios.

In vitro methylation of plasmid DNA 

DNA was methylated using SssI (NEB) as previously
described (Klug and Rehli, 2006). Briefly, plasmids were
incubated with SssI (2.5 U/µg) and 160 µM
S-adenosylmethionine at 37 °C for 4 h, then supplemented
with an additional 160 µM of S-adenosylmethionine for
another 4 h at 37 °C. Mock-methylated plasmid controls were
treated similarly but without enzyme. Plasmids were recovered
by phenol/chloroform extraction followed by ethanol
precipitation, then transfected into T47D cells, and
luciferase assays were performed.

Sequenom MassArray

Genomic DNA from clinical samples was bisulfite converted
with the EZ-96 DNA methylation kit (Zymo Research, Irvine,
CA, USA). Methylation levels in clinical samples were
determined using Sequenom MassArray, performed according
to the manufacturer's recommendations for the T-cleavage
chemistry protocol and analysed within a mass window of 1640
to 7000 (Coolen et al., 2007). Average methylation of each
patient is defined as the average percent methylation of all
CpG units in each amplicon. Average methylation of each CpG
cluster (or profile) is defined as the average percent
methylation of the cohort for that specific CpG cluster. In
all, 0-100% methylation is represented by 0.0 to 1.0. Primer
sequences are provided in Supplementary Table 4.
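
The two averages defined above can be sketched as row and column means over a patient-by-CpG-unit table for one amplicon. The data layout, the function names and the handling of failed measurements (stored as `None` and skipped) are assumptions made for illustration and are not described in the text.

```python
from statistics import mean


def average_per_patient(table):
    """Mean methylation (0.0 to 1.0) across all CpG units, per patient.

    `table` maps a patient ID to a list of per-unit values for one amplicon.
    None marks a unit with no usable measurement (assumed convention).
    """
    return {patient: mean(v for v in units if v is not None)
            for patient, units in table.items()}


def average_per_cpg_unit(table):
    """Mean methylation across the cohort for each CpG unit (the profile)."""
    columns = zip(*table.values())  # one tuple of patient values per unit
    return [mean(v for v in column if v is not None) for column in columns]
```

Both functions raise `statistics.StatisticsError` if a patient or a CpG unit has no usable values, which makes a completely failed measurement visible instead of silently reporting it as zero.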